[fix](partition_prune) Move the pruning of predicates that are always true after partition pruning into the PlanPostProcessor#63111
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
1 similar comment
|
run buildall |
|
run buildall |
|
run buildall |
TPC-H: Total hot run time: 29511 ms |
TPC-DS: Total hot run time: 170534 ms |
|
run buildall |
TPC-H: Total hot run time: 29722 ms |
TPC-DS: Total hot run time: 170733 ms |
FE Regression Coverage ReportIncrement line coverage |
|
run p0 |
|
I agree with the core idea: partition-prunable predicates should not be removed before MV rewrite, otherwise MV matching may see an incomplete query predicate set and produce an unsafe rewrite. One concern is where the deferred-removal state is stored. The current implementation records
Putting that proof in A cleaner structure may be to attach this information to the scan itself, e.g. class PartitionPruneInfo {
Set<Long> provedPartitionIds;
List<Slot> partitionSlots;
Set<Expression> prunableConjuncts;
}Then the post processor can simply inspect scan.getPartitionPruneInfo() and remove a conjunct only when: partitionPruneInfo.getProvedPartitionIds().containsAll(scan.getSelectedPartitionIds())This keeps the invariant local to the scan and avoids using StatementContext as a side table for plan-node-specific proof. |
…nstead of StatementContext ### What problem does this PR solve? Problem Summary: Move PartitionPrunablePredicate info from StatementContext (keyed by TableIdentifier) onto LogicalOlapScan / PhysicalOlapScan so it follows the scan through with*/deepCopy and is included in equals. Also propagate the info onto the MV scan rewritten by SyncMaterializationContext#getScanPlan so the post-processor can still strip the deferred conjuncts after sync MV rewrite. ### Release note None ### Check List (For Author) - Test: Regression test (prune_predicates_mv_test.groovy) - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
One olap scan can carry at most one PartitionPrunablePredicate, so use Optional instead of Set. Simplifies merge in PruneOlapScanPartition and iteration in PrunePartitionPredicate. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d817a84 to
4c371b5
Compare
|
run buildall |
TPC-H: Total hot run time: 29486 ms |
TPC-DS: Total hot run time: 169360 ms |
FE UT Coverage ReportIncrement line coverage |
|
run feut |
FE Regression Coverage ReportIncrement line coverage |
FE UT Coverage ReportIncrement line coverage |
|
/review |
| * maps them onto the actual scan's output slots by column name before | ||
| * performing the conjunct removal. | ||
| */ | ||
| public class PartitionPrunablePredicate { |
There was a problem hiding this comment.
maybe this file should be put into this package.
There was a problem hiding this comment.
Review summary: no additional blocking issues found. The change matches the stated goal of deferring partition-pruned predicate removal until after MV rewrite, and the new regression coverage exercises non-partitioned async MV, partitioned async MV, and sync MV rewrite cases. Existing inline package-placement feedback for PartitionPrunablePredicate was treated as already-known and not duplicated.
Critical checkpoints:
- Goal/test: the PR addresses the MV wrong-result risk caused by removing partition predicates before rewrite; regression coverage was added for the relevant MV paths.
- Scope: the implementation is focused on carrying prune metadata through scan nodes and consuming it in a post-processor.
- Concurrency/lifecycle: no new shared mutable state, locks, async lifecycle, or static initialization risks identified.
- Config/compatibility: no new config, storage format, or FE-BE protocol compatibility concern identified.
- Parallel paths: async MV, partitioned MV, sync MV, and logical-to-physical scan propagation were considered; no distinct missing path found during review.
- Conditional checks: skipPrunePredicate/delete behavior is preserved, and removal is guarded by partition-id containment plus slot remapping.
- Tests/results: new regression .out appears deterministic with order_qt where needed; I did not run the suite in this review environment.
- Observability/transactions/data writes: not applicable to this optimizer-only change.
- Performance: no blocking concern found; deferring removal may keep redundant predicates in the memo longer, but the post-processor still removes supported final physical filter-over-scan cases.
User focus: no additional user-provided review focus was specified.
|
run buildall |
|
run buildall |
TPC-H: Total hot run time: 29585 ms |
TPC-DS: Total hot run time: 171194 ms |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
FE UT Coverage ReportIncrement line coverage |
|
/review |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
There was a problem hiding this comment.
Code review completed. I did not find a critical/blocking issue to request changes.
Critical checkpoint conclusions:
- Goal and tests: The PR defers removal of partition-prunable predicates until after MV rewrite and carries the necessary metadata through logical/physical OLAP scans. The added regression suite covers async MV, partitioned MV, and sync MV alias cases; unit coverage for
PartitionPruner.pruneWithResultremains focused on predicate-result metadata. - Scope and focus: The implementation is reasonably focused on partition predicate pruning/MV rewrite interaction. The existing package-location review thread was already addressed by moving
PartitionPrunablePredicatetotrees.plans, so I did not repeat it. - Concurrency/lifecycle: The changes are optimizer plan metadata and post-processing only; I did not identify new shared mutable state, lock ordering, or lifecycle hazards.
- Configuration/compatibility: No new configs, storage formats, Thrift/protocol fields, or rolling-upgrade compatibility concerns were introduced.
- Parallel paths: OLAP scan logical-to-physical propagation and sync MV scan propagation are covered. External file scan partition pruning is not part of this behavior and remains separate.
- Conditional checks:
skipPrunePredicateand delete-statement exclusions are preserved both when recording and when removing deferred predicates. The post-processor also verifies that recorded selected partitions still cover the final scan partitions before removing predicates. - Test coverage/results: Regression output is present and deterministic via
order_qt; no additional user-provided focus points were supplied. I did not run the tests in this review runner. - Observability: No new observability requirement found; this is an optimizer rewrite/post-process change.
- Transaction/persistence/data writes: Not applicable.
- FE/BE variable passing: Not applicable.
- Performance: The added post-processor work is limited to physical filters directly over OLAP scans with recorded metadata; no obvious hot-path anti-pattern was found.
User focus: .opencode-review.yiiWoK/review_focus.txt says no additional focus was provided; no extra focus-specific issue was found.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #57169
Problem Summary:
After partition pruning, predicates that are evaluated to constant true are removed as an optimization (introduced by #57169). However, this introduced a bug in materialized view rewriting: when such predicates are removed, they are not compensated above the rewritten materialized view. Since the materialized view itself is not partitioned, this leads to incorrect query results.
This PR fixes the issue by moving the removal of constant-true predicates to the planPostProcessors phase. This ensures that the predicate removal does not interfere with materialized view rewriting, preserving correctness while still retaining the optimization benefit.
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)